Speech Produced in Noise Doctoral Thesis
Abstract
When exposed to noise, speakers modify the way they speak, possibly in an effort to maintain intelligible communication. These modifications are collectively referred to as the Lombard effect. The work described in this thesis compares the speech production changes induced by noise with various spectral and temporal characteristics, and explores the perceptual consequences of these changes. The thesis consists of a series of experimental studies involving the analysis of speech corpora collected under different noise conditions, with and without a communicative task. Intelligibility is also measured and predicted using a computer model.
The first study concerns the acoustic and phonetic consequences of N-talker “babble” noise on sentence production for a range of values of N from 1 (competing talker) to “infinity” (speech-shaped noise). The effect of noise on speech production increased with N and noise level, both of which act to increase the energetic masking effect of the noise. In a background of stationary noise, noise-induced speech was always more intelligible than speech produced in quiet, and the gain in intelligibility increased with N and noise level, suggesting that talkers modify their productions to ameliorate energetic masking at the ears of the listener.
The effect of low- and high-pass filtered noise on speech production was also examined to address the issue of whether speakers can compensate for energetic masking by actively shifting their spectral energy to regions least affected by the noise. Little evidence was found that speakers are able to modify their speech production to take advantage of those spectral regions clear of noise.
To evaluate the origin of the increased intelligibility of Lombard speech, the fundamental frequency and spectral tilt of speech produced in quiet were artificially manipulated to match those of speech produced in speech-shaped noise. A perceptual evaluation showed that spectral flattening made a larger contribution to Lombard speech intelligibility, but found no influence of an increase in fundamental frequency. A computational modeling study indicated that durational changes could also play an important role in increasing intelligibility. These findings suggest that speech modifications which reallocate energy in time and frequency to introduce more “glimpses” of clean speech in the presence of noise can contribute to speech intelligibility.
An analysis of the effect of noise on speech production requires material recorded while speakers undertake realistic tasks. The effect of a communication factor was explored using conversational speech collected in the presence of maskers with differing degrees of energetic and informational masking potential. The size of speech production changes was found to scale with the energetic masking potential of the background noise, extending the findings with read speech to a communicative task. In addition, relative to the non-communicative task, speakers exploited temporal planning to reduce the amount of overlap with a modulated background noise, an effect which was stronger when the noise contained intelligible speech.
In conclusion, the strategies used by talkers to promote successful speech communication under various noise conditions reported in this thesis could enable spoken output applications such as dialogue systems to adapt to their communicative environment.
Similar resources
Doctoral Dissertation Blind Speech Enhancement with Independent Component Analysis and Spectral Subtraction
A hands-free speech recognition system and a hands-free telecommunication system are essential for realizing an intuitive, unconstrained, and stress-free human-machine interface. In an actual acoustic environment, however, not only the user’s speech but also interfering signals such as background noise and interfering speech are present. Such interferences disturb high-quality speech reco...
Doctoral Dissertation Blind Source Separation Based on Multistage Independent Component Analysis
A hands-free speech recognition system and a hands-free telecommunication system are essential for realizing an intuitive, unconstrained, and stress-free human-machine interface. In real acoustic environments, however, speech recognition performance and speech recording performance are significantly degraded because we cannot detect the user’s speech with a high signal-to-noise ratio (SNR) ow...
A Comparative Study of the Defense of Nursing PhD Thesis in Iran and Top United States Universities
Background: The most important event in the doctoral course is the completion and defense of the dissertation, which leads to learning and improving the necessary skills to conduct research and improve performance in the field. Evaluating a doctoral dissertation defense program helps to identify the strengths and weaknesses of this process. Therefore, this comparative study has investigated th...
Statistical methods for incomplete speech data
Aalto University, P.O. Box 11000, FI-00076 Aalto www.aalto.fi Author Ulpu Remes Name of the doctoral dissertation Statistical methods for incomplete speech data Publisher School of Electrical Engineering Unit Department of Signal Processing and Acoustics Series Aalto University publication series DOCTORAL DISSERTATIONS 149/2016 Field of research Speech and Language Technology Manuscript submitt...
Research as transition instrument: A phenomenological investigation of future image in Ph.D. thesis writing
This research has been done to investigate Shiraz university doctoral students’ perspectives on thesis writing. Required data has been gathered by using deep interviews with eight doctoral students. Based on an abductive research strategy and using interpretative phenomenology, the research findings show a Ph.D. thesis doesn’t have a place in the big picture of their life. Themes abstracted fro...